116 research outputs found

    Behavior Planning For Connected Autonomous Vehicles Using Feedback Deep Reinforcement Learning

    Full text link
    With the development of communication technologies, connected autonomous vehicles (CAVs) can share information with each other. We propose a novel behavior planning method for CAVs to decide actions, such as whether to change or keep lanes, based on each vehicle's observation and the information shared by neighbors, and to ensure that corresponding control maneuvers, such as acceleration and steering angle, exist to guarantee the safety of each individual autonomous vehicle. We formulate this problem as a hybrid partially observable Markov decision process (HPOMDP) to capture objectives such as improving traffic flow efficiency and driving comfort as well as safety requirements. The discrete state transition is determined by the proposed feedback deep Q-learning algorithm, using the feedback action from an underlying controller based on control barrier functions. The feedback deep Q-learning algorithm we design aims to solve a critical challenge of reinforcement learning (RL) in physical systems: guaranteeing the safety of the system while the RL agent explores the action space to increase its reward. We prove that our method renders a forward-invariant safe set for the continuous-state physical dynamic model of the system while the RL agent is learning. In experiments, our behavior planning method increases traffic flow and driving comfort compared with the intelligent driver model (IDM). We also validate that our method maintains safety during the learning process. Comment: conference paper
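
    As a rough illustration of the action-filtering pattern this abstract describes (a learned discrete policy checked by a control-barrier-function controller), the Python sketch below screens Q-learning actions with a discrete-time CBF condition for a simple car-following model. The headway barrier, dynamics, thresholds, and fallback rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Hedged sketch: filter discrete behavior actions through a discrete-time
# control-barrier-function (CBF) check for a one-lane car-following model.
# The barrier, dynamics, and fallback rule are illustrative assumptions.

def barrier(gap, v_ego, t_headway=1.5):
    """h(x) >= 0 means 'safe': keep at least a time-headway-scaled gap."""
    return gap - t_headway * v_ego

def cbf_ok(gap, v_ego, v_lead, accel, dt=0.1, alpha=0.5):
    """Discrete-time CBF condition: h(x_next) >= (1 - alpha) * h(x)."""
    gap_next = gap + (v_lead - v_ego) * dt - 0.5 * accel * dt ** 2
    v_next = v_ego + accel * dt
    return barrier(gap_next, v_next) >= (1.0 - alpha) * barrier(gap, v_ego)

def safe_action(q_values, accel_of_action, state):
    """Return the highest-Q action whose maneuver passes the CBF check;
    otherwise fall back to the hardest brake (the safe 'feedback' action)."""
    gap, v_ego, v_lead = state
    for a in np.argsort(q_values)[::-1]:        # try actions from best Q down
        if cbf_ok(gap, v_ego, v_lead, accel_of_action[a]):
            return int(a)
    return int(np.argmin(accel_of_action))

# Three behaviors mapped to accelerations (brake, keep speed, accelerate).
print(safe_action(np.array([0.2, 0.5, 0.9]),      # Q-values per action
                  np.array([-3.0, 0.0, 2.0]),     # low-level maneuvers
                  state=(30.0, 10.0, 9.0)))       # gap, ego speed, lead speed
```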

    Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios

    Full text link
    Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system. In this work, we propose a framework of constrained multi-agent reinforcement learning (MARL) with a parallel safety shield for CAVs in challenging driving scenarios. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with a Graph Convolutional Network (GCN)-Transformer as a spatial-temporal encoder that enhances each agent's environment awareness. The safety shield module, with Control Barrier Function (CBF)-based safety checking, protects the agents from taking unsafe actions. We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for CAVs. With experiments deployed in the CARLA simulator, we verify the effectiveness of the safety checking, the spatial-temporal encoder, and the coordination mechanisms of our method through comparative experiments in several challenging scenarios with defined hazard vehicles (HAZVs). Results show that our proposed methodology significantly increases system safety and efficiency in challenging scenarios. Comment: This paper has been accepted by the 2023 IEEE International Conference on Robotics and Automation (ICRA 2023). 6 pages, 5 figures.
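
    The spatial-temporal encoder is the most transferable piece here. Below is a minimal PyTorch sketch of a GCN-Transformer encoder in the spirit described: a per-timestep graph convolution over the agent graph, followed by a per-agent Transformer over time. The layer sizes, the mean-aggregation GCN, and the fully connected example graph are assumptions; the paper's architecture may differ.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer with mean aggregation over neighbors."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (batch, agents, feat); adj: (agents, agents) with self-loops
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin((adj / deg) @ x))

class SpatialTemporalEncoder(nn.Module):
    """GCN over agents at each step, then a Transformer over time per agent."""
    def __init__(self, obs_dim, hid=64, heads=4, layers=2):
        super().__init__()
        self.gcn = GCNLayer(obs_dim, hid)
        enc = nn.TransformerEncoderLayer(d_model=hid, nhead=heads,
                                         batch_first=True)
        self.temporal = nn.TransformerEncoder(enc, num_layers=layers)

    def forward(self, obs_seq, adj):
        # obs_seq: (batch, time, agents, obs_dim)
        b, t, n, d = obs_seq.shape
        spatial = self.gcn(obs_seq.reshape(b * t, n, d), adj)  # per-step GCN
        spatial = spatial.reshape(b, t, n, -1).permute(0, 2, 1, 3)
        out = self.temporal(spatial.reshape(b * n, t, -1))     # per-agent time
        return out[:, -1].reshape(b, n, -1)                    # last-step embedding

enc = SpatialTemporalEncoder(obs_dim=8)
obs = torch.randn(2, 5, 4, 8)            # 2 batches, 5 steps, 4 agents
adj = torch.ones(4, 4)                   # fully connected CAV graph (assumed)
print(enc(obs, adj).shape)               # torch.Size([2, 4, 64])
```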

    What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

    Full text link
    Various types of Multi-Agent Reinforcement Learning (MARL) methods have been developed with the assumption that agents' policies are based on true states. Recent works have improved the robustness of MARL under uncertainties in the reward, the transition probability, or other agents' policies. However, in real-world multi-agent systems, state estimation may be perturbed by sensor measurement noise or even adversaries. Agents' policies trained with only true state information will deviate from optimal solutions when facing adversarial state perturbations during execution. MARL under adversarial state perturbations has received limited study. Hence, in this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to study the fundamental properties of MARL under state uncertainties. We prove that the optimal agent policy and the robust Nash equilibrium do not always exist for an SAMG. Instead, we define the solution concept of a robust agent policy for the proposed SAMG under adversarial state perturbations, where agents aim to maximize the worst-case expected state value. We then design a gradient descent ascent-based robust MARL algorithm to learn robust policies for the MARL agents. Our experiments show that adversarial state perturbations decrease agents' rewards for several baselines from the existing literature, while our algorithm outperforms these baselines under state perturbations and significantly improves the robustness of the MARL policies under state uncertainties.
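
    To make the "gradient descent ascent-based robust MARL" idea concrete, here is a toy PyTorch sketch of one GDA step: the agent's parameters ascend on an expected state value while an adversarial state perturbation descends on it inside an L-infinity ball. The stand-in networks, step sizes, and projection are assumptions, not the paper's algorithm.

```python
import torch
import torch.nn as nn

def gda_step(policy, value_fn, state, delta, eps=0.1,
             lr_agent=1e-3, lr_adv=1e-2):
    """One toy GDA step; value_fn, step sizes, and eps are illustrative."""
    delta = delta.detach().requires_grad_(True)
    v = value_fn(policy(state + delta), state + delta).mean()  # objective

    # Compute both gradients before mutating any parameters.
    agent_grads = torch.autograd.grad(v, list(policy.parameters()),
                                      retain_graph=True)
    (adv_grad,) = torch.autograd.grad(v, delta)

    with torch.no_grad():
        for p, g in zip(policy.parameters(), agent_grads):
            p += lr_agent * g                                  # agent: ascent
        delta = (delta - lr_adv * adv_grad).clamp(-eps, eps)   # adversary:
    return delta                                               # descent + project

# Example with stand-in networks for the policy and state value.
policy = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))
value_fn = lambda act, obs: -((act ** 2).sum(-1)) - ((obs ** 2).sum(-1))
delta = torch.zeros(8, 4)
delta = gda_step(policy, value_fn, torch.randn(8, 4), delta)
```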

    Shared Information-Based Safe And Efficient Behavior Planning For Connected Autonomous Vehicles

    Full text link
    Recent advancements in wireless technology enable connected autonomous vehicles (CAVs) to gather data via vehicle-to-vehicle (V2V) communication, such as processed LIDAR and camera data from other vehicles. In this work, we design an integrated information sharing and safe multi-agent reinforcement learning (MARL) framework for CAVs to take advantage of the extra information when making decisions, improving traffic efficiency and safety. We first use weight-pruned convolutional neural networks (CNNs) to process the raw image and point-cloud LIDAR data locally at each autonomous vehicle, and share the CNN-output data with neighboring CAVs. We then design a safe actor-critic algorithm that utilizes both a vehicle's local observation and the information received via V2V communication to explore an efficient behavior planning policy with safety guarantees. Using the CARLA simulator for experiments, we show that our approach improves the CAV system's efficiency in terms of average velocity and comfort under different CAV ratios and different traffic densities. We also show that our approach avoids the execution of unsafe actions and always maintains a safe distance from other vehicles. We construct an obstacle-at-corner scenario to show that shared vision can help CAVs observe obstacles earlier and take action to avoid traffic jams. Comment: This paper received the Best Paper Award at the DCAA workshop of AAAI 202
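
    The pipeline's first stage, pruning a local CNN so that its output features are compact enough to share over V2V, can be sketched with PyTorch's built-in pruning utilities. The architecture and the 50% pruning ratio below are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class LocalPerception(nn.Module):
    """Toy per-vehicle CNN whose output vector would be shared over V2V."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, out_dim)

    def forward(self, img):
        return self.head(self.conv(img).flatten(1))

model = LocalPerception()
for m in model.modules():                      # zero out 50% of conv weights
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.5)

features = model(torch.randn(1, 3, 128, 128))  # compact message to broadcast
print(features.shape)                          # torch.Size([1, 128])
```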

    Uncertainty Quantification of Collaborative Detection for Self-Driving

    Full text link
    Sharing information between connected and autonomous vehicles (CAVs) fundamentally improves the performance of collaborative object detection for self-driving. However, CAVs still face uncertainties in object detection due to practical challenges, which affect downstream modules in self-driving such as planning and control. Hence, uncertainty quantification is crucial for safety-critical systems such as CAVs. Our work is the first to estimate the uncertainty of collaborative object detection. We propose a novel uncertainty quantification method, called Double-M Quantification, which tailors a moving block bootstrap (MBB) algorithm with direct modeling of the multivariate Gaussian distribution of each corner of the bounding box. Our method captures both the epistemic and aleatoric uncertainty with one inference pass based on the offline Double-M training process, and it can be used with different collaborative object detectors. Through experiments on a comprehensive collaborative perception dataset, we show that our Double-M method achieves more than a 4X improvement in uncertainty score and more than a 3% improvement in accuracy, compared with state-of-the-art uncertainty quantification methods. Our code is publicly available at https://coperception.github.io/double-m-quantification. Comment: 6 pages, 3 figures.
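
    One ingredient of the method, a moving block bootstrap over a temporal sequence of detection residuals, can be sketched in a few lines of NumPy. The block length, the residual source, and the use of the bootstrap-mean covariance as an uncertainty estimate are illustrative assumptions rather than the Double-M algorithm itself.

```python
import numpy as np

def moving_block_bootstrap(residuals, block_len=10, n_boot=200, rng=None):
    """residuals: (T, d) sequence of, e.g., bounding-box corner errors."""
    rng = np.random.default_rng(0) if rng is None else rng
    T = len(residuals)
    blocks = np.stack([residuals[i:i + block_len]
                       for i in range(T - block_len + 1)])  # overlapping blocks
    n_blocks = int(np.ceil(T / block_len))
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(blocks), size=n_blocks)   # resample blocks
        samples.append(np.concatenate(blocks[idx])[:T])     # stitched series
    samples = np.stack(samples)
    # Covariance of the bootstrap means as a crude uncertainty estimate.
    return np.cov(samples.mean(axis=1), rowvar=False)

resid = np.random.default_rng(1).normal(size=(100, 2))  # toy corner errors
print(moving_block_bootstrap(resid).shape)              # (2, 2) covariance
```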

    Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications

    Full text link
    Reward design is a key component of deep reinforcement learning, yet some tasks and designers' objectives are unnatural to define as a scalar cost function. Among the various techniques, formal methods integrated with deep reinforcement learning (DRL) have garnered considerable attention due to their expressiveness and flexibility in defining the rewards and requirements for different states and actions of the agent. However, how to leverage Signal Temporal Logic (STL) to guide reward design for multi-agent reinforcement learning remains unexplored. Complex interactions, heterogeneous goals, and critical safety requirements in multi-agent systems make this problem even more challenging. In this paper, we propose a novel STL-guided multi-agent reinforcement learning framework. The STL requirements are designed to include both task specifications, according to the objective of each agent, and safety specifications, and the robustness values of the STL specifications are leveraged to generate rewards. We validate the advantages of our method through empirical studies. The experimental results demonstrate significant reward performance improvements compared to MARL without STL guidance, along with a remarkable increase in the overall safety rate of the multi-agent systems.
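
    The core mechanism, using STL robustness values as rewards, reduces to quantitative semantics: "always" takes the worst-case margin over a trace, "eventually" the best-case, and a conjunction the minimum of its parts. The sketch below illustrates this for a hypothetical safety-plus-task specification; the specifications, signals, and thresholds are assumptions for illustration, not the paper's specifications.

```python
import numpy as np

def rob_always(signal):          # G(phi): worst-case satisfaction margin
    return np.min(signal)

def rob_eventually(signal):      # F(phi): best-case satisfaction margin
    return np.max(signal)

def stl_reward(dists, goal_dists, d_min=2.0, goal_tol=1.0):
    """Robustness of: G(dist > d_min) AND F(goal_dist < goal_tol)."""
    safety = rob_always(dists - d_min)             # safety specification
    task = rob_eventually(goal_tol - goal_dists)   # task specification
    return min(safety, task)                       # conjunction: min of margins

# Example: an agent that stays >2 m from others and ends near its goal.
dists = np.array([5.0, 4.2, 3.1, 2.6])
goal_dists = np.array([10.0, 6.0, 3.0, 0.4])
print(stl_reward(dists, goal_dists))   # positive -> both specs satisfied
```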

    LawBench: Benchmarking Legal Knowledge of Large Language Models

    Full text link
    Large language models (LLMs) have demonstrated strong capabilities in various domains. However, when applying them to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal-related tasks. To address this gap, we propose a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously crafted to provide a precise assessment of LLMs' legal capabilities at three cognitive levels: (1) legal knowledge memorization: whether LLMs can memorize the needed legal concepts, articles, and facts; (2) legal knowledge understanding: whether LLMs can comprehend entities, events, and relationships within legal text; (3) legal knowledge application: whether LLMs can properly utilize their legal knowledge and take the necessary reasoning steps to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction, and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs, and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings certain improvements, we are still a long way from obtaining usable and reliable LLMs for legal tasks. All data, model predictions, and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.
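
    Since LawBench spans five task types, its evaluation presumably dispatches a different metric per type. The sketch below is a hypothetical illustration of such dispatch; the function, task-type strings, and metric choices are assumptions for illustration only, not LawBench's released evaluation code (see the linked repository for the real implementation).

```python
from difflib import SequenceMatcher

def score(task_type, prediction, reference):
    """Hypothetical per-task-type scoring dispatch (all choices assumed)."""
    if task_type == "slc":                  # single-label classification
        return float(prediction.strip() == reference.strip())
    if task_type == "mlc":                  # multi-label: Jaccard overlap
        p = {x.strip() for x in prediction.split(",")}
        r = {x.strip() for x in reference.split(",")}
        return len(p & r) / len(p | r) if p | r else 1.0
    if task_type == "regression":           # closeness of a numeric answer
        return 1.0 / (1.0 + abs(float(prediction) - float(reference)))
    # extraction / generation: crude string similarity as a stand-in for
    # the ROUGE-style metrics a real benchmark would use
    return SequenceMatcher(None, prediction, reference).ratio()

print(score("mlc", "theft, fraud", "fraud"))   # 0.5
```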

    Interaction between O-GlcNAc Modification and Tyrosine Phosphorylation of Prohibitin: Implication for a Novel Binary Switch

    Get PDF
    Prohibitin (PHB or PHB1) is an evolutionarily conserved, multifunctional protein present in various cellular compartments, including the plasma membrane. However, the mechanisms involved in the various functions of PHB are not yet fully explored. Here we report for the first time that PHB interacts with O-linked β-N-acetylglucosamine transferase (O-GlcNAc transferase, OGT) and is O-GlcNAc modified, and that it also undergoes tyrosine phosphorylation in response to insulin. Tyrosine 114 (Tyr114) and tyrosine 259 (Tyr259) in PHB are in close proximity to the potential O-GlcNAc sites serine 121 (Ser121) and threonine 258 (Thr258), respectively. Substitution of the Tyr114 and Tyr259 residues in PHB with phenylalanine by site-directed mutagenesis results in reduced tyrosine phosphorylation as well as reduced O-GlcNAc modification of PHB. Surprisingly, this also results in enhanced tyrosine phosphorylation and activity of OGT, which is attributed to the presence of similar tyrosine motifs in PHB and OGT. Substitution of Ser121 and Thr258 with alanine and isoleucine, respectively, results in attenuation of O-GlcNAc modification and increased tyrosine phosphorylation of PHB, suggesting an association between these two dynamic modifications. Sequence analysis of O-GlcNAc-modified proteins with known O-GlcNAc modification site(s) or known tyrosine phosphorylation site(s) reveals a strong potential association between these two posttranslational modifications in various proteins. We speculate that O-GlcNAc modification and tyrosine phosphorylation of PHB play an important role in tyrosine kinase signaling pathways, including insulin, growth factor, and immune receptor signaling. In addition, we propose that the interplay between O-GlcNAc modification and tyrosine phosphorylation is a novel, previously unidentified binary switch that may provide new mechanistic insights into cell signaling pathways and is open to direct experimental examination.

    SYNTHESIS OF TETRAZINE-BASED COVALENT ORGANIC NETWORKS

    No full text
    After a comparison of inorganic, hybrid, and organic porous materials, a tetrazine-based organic porous material was chosen as the target material because it is a rigid material providing micro- to mesoscale pores and can undergo post-synthetic modification through the inverse electron demand Diels-Alder reaction with different dienophiles. Such networks have potential as catalysts after modification with different metal-chelation sites. Here, three strategies are provided to synthesize such materials. During the construction of building blocks through the double Diels-Alder reaction, the reactivities of tetrazines and dienophiles were studied. The coupling study provided information on which type of coupling reaction could be used as well as how it proceeded. The direct synthesis of the tetrazine-based network through formation of tetrazines then broadened the variety of materials that could be synthesized. M.S. in Chemistry, May 201